Troubleshooting the WEM

This appendix provides information on troubleshooting the following:
In addition to the above, instructions are also provided for capturing client and server logs. These are provided in the Capturing WEM Client Logs and Capturing WEM Server Logs using Script sections of this appendix.
Important: Unless otherwise specified, all information provided in this chapter applies to both Solaris and Red Hat Enterprise Linux-based WEM systems.
Issues Pertaining to Installation
If you received the “ERROR: could not initialize interface awt - exception: java.lang.InternalError: Cannot connect to X11 window server using ':0.0' as the value of the DISPLAY variable.” message, the display settings of your terminal program may be incorrect, or Exceed is not running on the client machine.
The /tmp directory may be full.
Determine the status of the /var/tmp directory by entering the df -k command. If it is at or near capacity, choose another directory for the Host Base Directory parameter setting. This parameter can be set via the installation process.
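The capacity check described above can be scripted. The following is a minimal sketch, not part of the WEM distribution: the function names and the 90 percent threshold are illustrative assumptions, and it assumes a df implementation that accepts the POSIX -P flag so that each file system is reported on a single line.

```shell
#!/bin/sh
# Print the percentage of space used on the file system holding directory $1.
# Assumes a df that accepts the POSIX -P flag (one output line per file system).
dir_usage_pct() {
    df -kP "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Succeed when usage is at or above the threshold $2 (default 90).
# The 90 percent default is an illustrative value, not a WEM requirement.
dir_nearly_full() {
    pct=$(dir_usage_pct "$1")
    [ -n "$pct" ] || return 2
    [ "$pct" -ge "${2:-90}" ]
}

# Example: dir_nearly_full /var/tmp && echo "choose another Host Base Directory"
```

If the function reports that the directory is nearly full, select a different directory for the Host Base Directory parameter during installation.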
Received “Unable to install Element Management System <version> over Element Management System: Installed product has newer version.” message when attempting installation.
Determine if WEM packages exist in the /var/sadm/pkg directory. These packages begin with “EMS”. If packages exist, remove them by entering the pkgrm -n EMS* command. Once they’ve been removed, reinstall the application.
Enter the ps -ef | grep server command to determine if any process instances from previous installations are running. If so, stop them using the instructions in the WEM Server Files and Operation chapter of this guide. Once stopped, start the processes for the current installation using the instructions in the same chapter.
Issues Related to Starting WEM
The log directory may have been accidentally deleted.
The “ServerPort” and/or “ServerIIOPPort” port values configured for the WEM are in use by other processes.
Verify that the Postgres database is running by entering the ps -ef | grep post command. If it is not, follow the instructions in the WEM Server Files and Operation chapter of this guide to start it.
Determine if the log directory exists in <ems_dir>/server (default directory) using the ls command to display the contents of the directory. If it is missing, create it using the mkdir command, then stop and restart all WEM processes using the instructions in the WEM Server Files and Operation chapter of this guide.
Enter the ps -ef | egrep "server|bulkstatparser|bulkstatserver|scriptsrv" command to determine whether WEM server processes are running and, if they are, which directory they were started from. If that directory is not the desired installation directory, stop the processes and restart the server from within the desired installation directory using the instructions in the WEM Server Files and Operation chapter of this guide.
Determine if the “ServerPort” and/or “ServerIIOPPort” port numbers specified in the nms.cfg file (located in the <ems_dir>/server/etc directory by default) are already in use. The default “ServerPort” is 22222, and the default “ServerIIOPPort” is 15000. To check, enter the netstat -a command, which displays a list of all the process addresses and ports in use in “ipaddress.port” format. If the ports are in use, either stop the other processes or configure new values for these parameters.
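The netstat check for the default ports can be automated by filtering the command output. The sketch below is an illustration only; the function name port_in_use is hypothetical. It reads netstat output from standard input, considers only lines in the LISTEN state, and accepts both the "ipaddress.port" notation and the "ipaddress:port" notation printed by some Linux versions of netstat.

```shell
#!/bin/sh
# Succeed if TCP port $1 appears as a listening endpoint in the netstat
# output read from standard input. Both "address.port" and "address:port"
# notations are accepted. Only lines containing LISTEN are examined.
port_in_use() {
    awk -v port="$1" '
        /LISTEN/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ ("[.:]" port "$")) found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# Example: netstat -an | port_in_use 22222 && echo "ServerPort 22222 is in use"
```

Run the same check for 15000 (or for whatever values are configured in nms.cfg) before starting the server.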
A .s.PGSQL.5432.lock lock file is present in the /tmp directory prior to starting postgres.
Determine if a previous Postgres instance is still using system resources by entering the ipcs command. If it is, clear the resources by entering the ipcrm command.
NOTE: The table name in the above message is after the 'FROM' keyword.
Issues Related to Login
Received “Could not connect to server, destroying applet” message.
Verify that server processes are running using the information in WEM Server Files and Operation chapter of this guide.
Verify that IOR files are present; they are stored in the <ems_dir>/client/<ems-version-number>/ior directory by default. A number of files ending in .ior should be present. These files pertain to various functions supported by the WEM.
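A quick way to confirm that IOR files are present is to count them. The following sketch assumes only what is stated above (files ending in .ior in a single directory); the function name is hypothetical, and the number of files to expect is not specified in this guide.

```shell
#!/bin/sh
# Print the number of regular files ending in .ior directly inside directory $1.
count_ior_files() {
    count=0
    for f in "$1"/*.ior; do
        # When nothing matches, the pattern is left unexpanded and fails this test.
        [ -f "$f" ] && count=$((count + 1))
    done
    echo "$count"
}

# Example: count_ior_files "<ems_dir>/client/<ems-version-number>/ior"
```

A result of 0 indicates that the server has not written its IOR files to that directory.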
Edit the img.html file (located in the <ems_dir>/client directory by default) to use fixed ports and open the required ports in the firewall. This requires the configuration of the “FIXED_PORT”, “FIXED_PORT_RANGE_START” and “FIXED_PORT_RANGE_END” parameters.
If the user is “superuser”, the set_superuser_password script can be used to reset the “superuser” password to the default. If the user is not “superuser” then the administrator needs to be contacted to reset the user’s password.
Received “Java policy file is outdated or missing” message.
The .java.policy file is either missing from the user’s home directory on the client machine or it has expired.
Verify that the .java.policy file is present in your home directory. Refer to the Preparing and Using the Client Workstation chapter of this guide for more information.
Copy the .java.policy file from the “Java Policy File” link provided in the img.html file to your home directory. Ensure that no extension (for example, .txt) is appended to the file.
Important: All instances of the browser must be closed and restarted after the policy file has been updated.
Received “Server could not establish connection with client, therefore notifications will not work.” message.
Edit the img.html file (located in the <ems_dir>/client directory by default) to use fixed ports and open the required ports in the firewall. This requires the configuration of the “FIXED_PORT”, “FIXED_PORT_RANGE_START” and “FIXED_PORT_RANGE_END” parameters.
Check the configuration of the ConsecutiveFailLogin parameter in the ua.cfg file (located in the <ems_dir>/server/etc directory by default). If users are frequently locked out because they reach the maximum limit, consider increasing the limit or disabling the functionality. You can also reduce the amount of time the account is locked out by modifying the LockOutInterval parameter, which is also contained in the ua.cfg file.
Issues Related to the Web Browser
This is caused by the browser storing the .jar files for the newer version of the WEM client in its cache.
For JRE versions greater than 1.5: The Temporary Internet Files group in the General tab of the Java Control Panel should be used to disable caching.
For JRE versions greater than 1.6: The Temporary Internet Files group in the General tab of the Java Control Panel should be used to disable caching.
Check the Java console. If it shows exceptions such as "Exception in thread "AWT-EventQueue-2" java.lang.NoClassDefFoundError", the browser cache may be enabled on your workstation and may need to be cleared.
Issues Pertaining to CORBA Communication
Ensure ICMP connectivity between the system and the WEM Server using the ping <wem_server_ip_address> command from the chassis’ command prompt. Refer to the Command Line Interface Reference for more information on using this command.
Verify that the ORBEM client identification on the chassis matches that configured on the WEM. The configuration of this parameter on the chassis can be determined by entering the show configuration | grep client CLI command. In WEM, check the ASID (Application Server ID), Port, and SSL-enabled flag (IIOP/SIOP) on the Modify IMG screen. Change these settings as needed.
Check the status of the ORBEM client on the chassis by executing the show orbem client id <client_id> command on the chassis. The “State” should be “Enabled”. If the “State” is “Disabled”, execute the activate client id <client_id> command in the ORBEM Configuration Mode and check the status again; it should now be “Enabled”.
Verify that the configuration of the IIOP port on the chassis matches that configured for the WEM. The configuration of this parameter on the chassis can be determined by entering the show configuration | grep iiop-port command. In WEM, check the ASID (Application Server ID), Port, and SSL-enabled flag (IIOP/SIOP) on the Modify IMG screen. Change these settings as needed.
Verify that the IIOP transport parameter is enabled on the chassis by entering the show configuration | grep iiop-transport command. If it is not, enable it using the instructions found in the System Administration and Configuration Guide.
Check if the SSL is enabled and/or enforced on the WEM. If the SSL is enabled, disable the IIOP transport on the chassis and set the value of IMG Port for the chassis such that it is identical to the SIOP port parameter configured on the chassis.
Received “Callbacks between server and client are not working. Screen cannot be invoked.” message.
Edit the img.html file (located in the <ems_dir>/client directory by default) to use fixed ports and open the required ports in the firewall. This requires the configuration of the "FIXED_PORT", "FIXED_PORT_RANGE_START" and "FIXED_PORT_RANGE_END" parameters.
Issues Related to Bulk Statistics
Sun Solaris WEM Servers only: Solaris operating system patches may need to be updated.
Verify that the FTP server process is running on the server by issuing the ps -ef | grep in.ftpd command. If it is not, start it.
Check the username and password used to ftp the bulkstats data from the chassis to Web Element Management server. Compare the “FTPUserName” and “FTPPassword” parameters in the nms.cfg file located in the <ems_dir>/server/etc directory by default to the names of administrative users with FTP privileges on the chassis.
Verify that the “destination” directory is configured in the bsparser.cfg file located in the <ems_dir>/server/etc directory by default.
Sun Solaris WEM Servers only: Verify that the latest Solaris operating system patches are installed. Refer to the WEM Port and Hardware Information chapter of this guide for more information.
Ensure that the configuration of the bulk statistics receiver on the managed system is correct. Executing the show bulkstats command on the chassis displays this information. The “Remote File Format” field should contain a valid directory on the WEM Server. (Also verify that this directory exists on the server.) The “Bulkstats Receivers” field should contain the IP address of the WEM Server.
Invalid “sample-interval” parameter configuration on the system.
Verify that the Bulkstats Server process is running by entering the ps -ef | grep bulkstatserver command. If it is not, execute the ./serv bulkstatserver start command from within the server directory (<ems_dir>/server by default).
Verify that the “sample-interval” parameter on the system is set to either “1” or “5”. The value can be determined by entering the show bulkstats command on the command line.
Make sure that “XMLDataEnable” parameter in the etc/bsserver.cfg file is set to “1” (enabled). If it is not, change the setting, save the file, and execute the ./serv bulkstatserver start command from within the server directory (<ems_dir>/server by default).
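Reading a single parameter out of a configuration file can be scripted as shown below. This is a sketch under a stated assumption: it expects "Name = value" lines, the form shown for ServerIpAddress in nms.cfg later in this appendix, and the actual layout of bsserver.cfg may differ. The function name cfg_get is hypothetical.

```shell
#!/bin/sh
# Print the value of parameter $2 from configuration file $1.
# Assumes "Name = value" lines; lines without "=" and commented lines are ignored.
cfg_get() {
    awk -v key="$2" '
        index($0, "=") == 0 { next }
        {
            name = substr($0, 1, index($0, "=") - 1)
            gsub(/^[ \t]+|[ \t]+$/, "", name)
            if (name == key) {
                value = substr($0, index($0, "=") + 1)
                gsub(/^[ \t]+|[ \t]+$/, "", value)
                print value
                exit
            }
        }
    ' "$1"
}

# Example: [ "$(cfg_get etc/bsserver.cfg XMLDataEnable)" = "1" ] || echo "XML data is disabled"
```

The same helper can be pointed at other parameters named in this appendix, such as OverrideLastAccessFlag, provided the file follows the assumed layout.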
Received “No matching data found” error when fetching bulkstatistics reports.
Verify that the “sample-interval” parameter on the system is set to either “1” or “5”. The value can be determined by entering the show bulkstats command on the command line.
Verify that the Bulkstatistic Parser process is running by entering the ps -ef | grep bulkstatparser command. If it is not, execute the ./serv parserserver command from within the server directory (<ems_dir>/server by default).
Verify that the bulkstatistics format is compatible with the WEM. Refer to the bs.cfg file (located in the <ems_dir>/server/etc directory by default) for WEM bulkstatistic formatting.
For an existing installation, edit the “XMLDataEnable” parameter in the etc/bsserver.cfg file to be set to “1” (enabled). Once the setting is changed and the file is saved, execute the ./serv bulkstatserver start command from within the server directory (<ems_dir>/server by default).
Important: If XMLFileType is set to 1, it will generate XML files irrespective of the other two mentioned configurables.
Check if OverrideLastAccessFlag in bsserver.cfg is enabled. If it is enabled, disable it and restart the bulkstatserver.
Reconfiguration of schema is not done after upgrade. Refer to the Reconfiguration of Bulkstat Schemas section of this guide for more information.
Issues Pertaining to Configuration Backup
Solaris WEM Servers only: Solaris operating system patches may need to be updated.
Verify that the FTP server process is running on the server by issuing the ps -ef | grep in.ftpd command. If it is not, start it.
Solaris WEM servers only: Verify that the latest Solaris operating system patches are installed. Refer to WEM Port and Hardware Information chapter of this guide for more information.
Issues Pertaining to Alarms
Verify that the SNMP target IP address and port number configured on the chassis match that of the WEM server. The SNMP target configuration on the chassis can be determined by entering the show snmp transports command. Check this information against the WEM server IP address (“ServerIpAddress”, specified in the nms.cfg file) and the SNMP port number (“SnmpTrapPort”, specified in the fm.cfg file) parameters. (Both of these files are located in the <ems_dir>/server/etc directory by default.)
Verify that the E-mail parameters are properly configured in the fm.cfg file (located in the <ems_dir>/server/etc directory by default).
Verify that the E-mail information configured in the Alarm Configuration dialog of the WEM is correct.
Verify that the Script Server is running by entering the ps -ef | grep scriptsrv command. If it is not, execute the ./serv scriptserver command from within the server directory (<ems_dir>/server by default).
Verify that the script file is located in the <ems_dir>/server/scripts directory (this is the default directory). If it is not, copy the script to that location.
Verify that the script can be executed by entering the ls -al command from within the directory in which the script is located.
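The location and permission checks for an alarm script can be combined in a small helper. The sketch below is illustrative; the function name is hypothetical, and it tests permissions for the user who runs it, which may not be the user that the Script Server runs as.

```shell
#!/bin/sh
# Report whether script $1 exists and is executable by the current user.
# Prints one of: missing, executable, not executable.
check_script() {
    if [ ! -f "$1" ]; then
        echo "missing"
    elif [ -x "$1" ]; then
        echo "executable"
    else
        echo "not executable"
    fi
}

# Example: check_script "<ems_dir>/server/scripts/<script_name>"
```

If the result is "not executable", add execute permission with the chmod command and check again.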
Issues Pertaining to the Process Monitor (PSMON)
Verify that PSMON is running by entering the ps -ef | grep psmon command. If it is not, start it using the instructions located in the WEM Process Monitor chapter of this guide.
PSMON tries to restart each process "numretry" times within a duration of "tmintval" (refer to etc/psmon.cfg). If the process still does not start, PSMON no longer monitors it. Check the <ems_dir>/log/watchdog.log file for details, and try restarting the process using the serv script.
Issues Pertaining to Starting and Stopping EMS Processes
When starting or stopping a WEM process, the user receives the error message ld.so.1: httpd: fatal: libgcc_s.so.1: open failed: No such file or directory.
Log into the WEM Server as root and enter the crle command to view the current Default Library Path. Here is an example where the crle path does not contain the EMS library path:
# crle -u -l $EMS_INSTALL_PATH/server/lib
Log into the WEM Server as root and verify that the ems.conf file is present in the /<ems_dir> /etc/ld.so.conf.d/ directory. To view the existing ems.conf file, enter the following commands:
If the ems.conf file is not present, enter the following commands to create it:
# cd /<ems_dir>/ems/server/lib
Verify that the /etc/ld.so.conf file contains the ld.so.conf.d/*.conf entry. If the entry is not present, add it to the /etc/ld.so.conf file.
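The check for the ld.so.conf.d/*.conf entry can be done with grep. The following is a minimal sketch; the function name is hypothetical. It matches the entry with or without a leading keyword or directory prefix and skips lines that are commented out.

```shell
#!/bin/sh
# Succeed if file $1 (normally /etc/ld.so.conf) contains an uncommented
# line referring to ld.so.conf.d/*.conf.
has_conf_d_entry() {
    grep -v '^[[:space:]]*#' "$1" | grep -q 'ld\.so\.conf\.d/\*\.conf'
}

# Example: has_conf_d_entry /etc/ld.so.conf || echo "entry is missing"
```

If the function reports that the entry is missing, add it to the /etc/ld.so.conf file as described above.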
Issues Pertaining to Java
Update the .java.policy file on any client machine that uses JRE 1.6.0_24 and above with the following line:
An error message displays to update the .java.policy file when the user invokes the WEM url. This is typical with newer WEM builds.
Download the .java.policy file again from the options located on the WEM splash screen.
The WEM Process Monitoring dialog shows “Could not connect to server. Screen will not be invoked”.
Library=C:\Program Files\Java\j2re1.4.2_04\bin\jsound.dll
Issues Pertaining to WEM Upgrade
Capturing WEM Client Logs
In the event that an issue exists that could not be solved using the information provided previously in this chapter, you may need to capture client logs for debugging purposes. This section provides information on how to utilize logging for WEM clients.
Step 1
Step 2
Step a
Right-click the Java(TM) 2 Platform icon in the status area (Windows System Tray).
Step b
Select Open Console from the menu.
Step 3
Step 4
The Java Console contains log messages that could be used for debugging the issue.
Capturing WEM Server Logs using Script
In the event additional troubleshooting assistance is required, debugging information can be collected using a script called getSupportDetails.pl. This script collects different log files and captures the output of certain system commands that aid in troubleshooting issues. This script is packaged with the WEM Server in the <EMS_INSTALL_DIR>/tools/supportdetails/ directory.
This script refers to an XML file to get the list of logs. This XML file resides in the same directory as the script. Once executed, the script retrieves the contents of logs, files, and folders, along with the output of certain commands, and prepares a zipped file (/tmp/log/emssupportDetails.tar.gz). By default, the file is placed in the /tmp/log directory.
 
Requirements:
Perl 5.8.5 and above is required for running the script. This is packaged with the WEM Server.
Apart from standard Perl modules (which are included in default installation of Perl), some additional modules are required for running the script. The list is as follows:
These modules are installed by default by the WEM application. Please ensure that the above mentioned modules are installed when using a different installation of Perl.
To run the script, go to the path where the script is present and enter:
./getSupportDetails.pl [--level=...] [--xmlfile=...] [--outputDir=] [--help]
Default: getSupportDetails.xml
Specifies the output directory for the emssupportDetails.tar.gz file if different from the default output directory (/tmp/).
For example:
./getSupportDetails.pl --level=4 --xmlfile=/tmp/something.xml --outputDir=/mywemlogscripts
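After the script has run, the contents of the resulting archive can be listed without extracting it. The sketch below uses only the standard gzip and tar utilities and assumes nothing about the script beyond the emssupportDetails.tar.gz file name given above; the function name is hypothetical.

```shell
#!/bin/sh
# List the members of the gzip-compressed tar archive $1 without extracting it.
list_archive() {
    gzip -dc "$1" | tar tf -
}

# Example: list_archive /tmp/log/emssupportDetails.tar.gz
```

Reviewing the listing is a simple way to confirm that the expected logs were collected before the archive is passed on for debugging.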
WEM IP Address Change Procedure
If the customer’s network changes, the IP address of the WEM server might need to change as well. To change the WEM server IP address, use the following procedure:
Step 1
./serv stop
Step 2
ifconfig bge0 192.168.1.1 netmask 255.255.255.0 up
Step 3
vi nms.cfg
Replace the existing IP address with the new IP address in the ServerIpAddress field. For example:
ServerIpAddress = 192.168.1.1
Save the file after making the appropriate changes.
Step 4
# cat /etc/hosts
<IP address> localhost
<new_IP_address> solaris_hostname
#
Step 5
./serv start
Important: The /etc/netmasks file needs to be modified if the user is subnetting an existing address and subsequently using a network mask different from the default one. If the netmask being used for a given IP address is the default one, there is no need to modify this file.
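The edit made in Step 3 can also be performed without an interactive editor. The following sketch assumes the "ServerIpAddress = <address>" form shown in that step; the function name and the .bak backup copy are choices made for this illustration and are not part of the WEM tooling.

```shell
#!/bin/sh
# Rewrite the ServerIpAddress line of file $1 so that it carries address $2.
# Assumes the "ServerIpAddress = <address>" form shown in Step 3.
# The original file is preserved as $1.bak (a choice made for this sketch).
set_server_ip() {
    cp "$1" "$1.bak" || return 1
    sed "s/^\([[:space:]]*ServerIpAddress[[:space:]]*=\).*/\1 $2/" "$1.bak" > "$1"
}

# Example: set_server_ip nms.cfg 192.168.1.1
```

Stop the WEM processes before changing the file and start them again afterwards, as described in Step 1 and Step 5.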
 
 

Cisco Systems Inc.
Tel: 408-526-4000
Fax: 408-527-0883